IEEE Transactions on Medical Imaging
● Institute of Electrical and Electronics Engineers (IEEE)
Preprints posted in the last 30 days, ranked by how well they match IEEE Transactions on Medical Imaging's content profile, based on 18 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Li, S.; Gao, J.; Kim, C.; Choi, S.; Chen, Q.; Wang, Y.; Wu, S.; Zhang, Y.; Huang, T.; Zhou, Y.; Yao, B.; Yao, Y.; Li, C.
Three-dimensional (3D) handheld photoacoustic tomography typically relies on bulky and expensive external positioning trackers to correct motion artifacts, which severely limits its clinical flexibility and accessibility. To address this challenge, we present PA-SfM, a tracker-free framework that leverages exclusively single-modality photoacoustic data for both sensor pose recovery and high-fidelity 3D reconstruction via differentiable acoustic radiation modeling. Unlike traditional Structure-from-Motion (SfM) methods that formulate pose estimation as a geometry-driven optimization over visual features, PA-SfM integrates the acoustic wave equation into a differentiable programming pipeline. By leveraging a high-performance, GPU-accelerated acoustic radiation kernel, the framework simultaneously optimizes the 3D photoacoustic source distribution and the sensor array pose via gradient descent. To ensure robust convergence in freehand scenarios, we introduce a coarse-to-fine optimization strategy that incorporates geometric consistency checks and rigid-body constraints to eliminate motion outliers. We validated the proposed method through both numerical simulations and in vivo rat experiments. The results demonstrate that PA-SfM achieves sub-millimeter positioning accuracy and restores high-resolution 3D vascular structures comparable to ground-truth benchmarks, offering a low-cost, software-defined solution for clinical freehand photoacoustic imaging. The source code is publicly available at https://github.com/JaegerCQ/PA-SfM.
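As a compact illustration of the optimization idea in this abstract, the toy PyTorch sketch below jointly fits a point-source map and a rigid array translation by gradient descent through a differentiable forward model. The distance-decay forward model, grid sizes, and learning settings are illustrative assumptions, not the authors' GPU-accelerated acoustic radiation kernel.

```python
# Toy sketch only: jointly optimize a source map and a sensor-array pose by
# gradient descent through a differentiable forward model (PA-SfM-style idea).
import torch

torch.manual_seed(0)
src_xy = torch.rand(8, 2) * 10.0                     # candidate point-source positions (mm)
true_amp = torch.rand(8)                             # ground-truth source amplitudes
true_pose = torch.tensor([2.0, -1.0])                # true rigid translation of the array (mm)
elem = torch.linspace(0.0, 10.0, 32)
sensors0 = torch.stack([elem, torch.zeros_like(elem)], dim=1)   # nominal element positions

def forward(amp, pose):
    # toy "acoustic" forward model: source amplitude decays with source-sensor distance
    d = torch.cdist(sensors0 + pose, src_xy)         # (32, 8) distances
    return (amp / (d + 1.0)).sum(dim=1)              # one sample per array element

meas = forward(true_amp, true_pose)                  # synthetic measurement

amp = torch.zeros(8, requires_grad=True)             # unknown source map
pose = torch.zeros(2, requires_grad=True)            # unknown array pose
opt = torch.optim.Adam([amp, pose], lr=0.05)
for _ in range(3000):
    opt.zero_grad()
    loss = torch.mean((forward(amp, pose) - meas) ** 2)
    loss.backward()
    opt.step()

print("estimated pose:", pose.detach().tolist(), " true pose:", true_pose.tolist())
```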
Maidu, B.; Gonzalo, A.; Guerrero-Hurtado, M.; Bargellini, C.; Martinez-Legazpi, P.; Bermejo, J.; Contijoch, F.; Flores, O.; Garcia-Villalba, M.; McVeigh, E.; Kahn, A.; del Alamo, J. C.
Atrial fibrillation (AF) promotes blood stasis and thrombus formation, most often within the left atrial appendage (LAA), and can lead to stroke or transient ischemic attack (TIA). Time-resolved contrast-enhanced computed tomography (4D CT) captures left atrial (LA) opacification and washout, but it does not directly provide quantitative stasis metrics such as blood residence time. Patient-specific computational fluid dynamics (CFD) can quantify LA/LAA residence time, yet routine clinical use is limited by computational cost and sensitivity to patient-specific boundary conditions. Here, we present two complementary approaches to infer time-resolved 3D residence time fields directly from contrast dynamics. First, a physics-informed neural network (PINN) treats contrast as a passive scalar and jointly reconstructs velocity and residence time by enforcing the incompressible Navier-Stokes equations and transport equations for contrast concentration and residence time in moving, patient-specific LA anatomies. Second, an indicator dilution theory (IDT) formulation computes voxelwise, time-resolved residence time maps from contrast time curves alone by constructing a PV-referenced impulse response and modeling transport with a tank-in-series model with spatially dependent parameters. Both methods are benchmarked against patient-specific CFD in six cases spanning diverse LA function, including three patients with TIA or thrombus in the LAA and three patients free of events. Both approaches reproduce expected spatial and temporal trends, with higher residence time in the distal LAA and higher LAA residence time in cases with TIA or thrombus. IDT demonstrates the closest agreement with CFD across the full range of residence times and produces maps in seconds, facilitating clinical translation. In contrast, the PINN additionally recovers phase-dependent atrial flow structures, but tends to smooth and underestimate the highest residence-time regions and requires hours of training. Together, these results support a scalable workflow in which IDT enables rapid stasis screening from contrast CT, and PINNs provide a complementary pathway for detailed, patient-specific hemodynamic inference when full-field flow information is needed.
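For readers unfamiliar with the indicator-dilution building block referenced above, the short NumPy sketch below constructs a standard tanks-in-series impulse response and checks that its first moment recovers the mean residence time n·τ. This is the textbook model only, with invented parameters; the paper's PV-referenced, voxelwise formulation is not reproduced.

```python
# Standard tanks-in-series residence-time distribution (textbook IDT building block).
import math
import numpy as np

t = np.linspace(0.0, 60.0, 6001)                 # time axis (s)
dt = t[1] - t[0]
n, tau = 3, 2.5                                  # number of tanks and per-tank time constant (assumed)

h = t ** (n - 1) * np.exp(-t / tau) / (tau ** n * math.factorial(n - 1))   # impulse response
h /= (h * dt).sum()                              # normalize to unit area
mtt = (t * h * dt).sum()                         # first moment = mean residence time
print(f"mean residence time: {mtt:.2f} s (theory: {n * tau:.2f} s)")
```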
Ma, S.; Xu, M.; Dao, M.; Li, H.
Microscopy-based analysis of red blood cell (RBC) morphology is widely used to study phenotypes in sickle cell disease (SCD). Although AI models have been developed to automate classification, most are trained on pre-cropped single-cell images and thus struggle with full-scope microscopic images containing densely packed cells and diverse morphologies, which require both accurate detection and fine-grained classification. We propose an end-to-end computational framework to identify individual RBCs in full-scope microscopy images and classify them into five morphological categories: discocytes (DO), echinocytes (E), elongated and sickle-shaped cells (ES), granular cells (G), and reticulocytes (R). We first evaluate advanced detection-classification models, including You Only Look Once (YOLO) and Detection Transformers (DETR), and demonstrate that while these models effectively detect cells, their classification performance falls short of specialized classifiers trained on single-cell images, particularly for minority phenotypes. To address this limitation, we introduce a two-step framework in which a YOLO-based detector localizes and crops individual cells from full-scope images, followed by a fine-tuned DenseNet121 ensemble classifier that assigns each cell to one of the five morphological categories. The proposed framework achieves a detection-level F1-score of 0.9661 and a weighted-average classification F1-score of 0.9708, with an overall classification accuracy of 97.06%. Compared with the single-step YOLO26n baseline, the two-step pipeline yields a macro-average F1-score improvement of +0.1675, with particularly substantial gains for minority classes (E: +0.1623; G: +0.2774; R: +0.2603). Overall, this hybrid framework demonstrates a practical strategy for adapting fast, general-purpose detection models to domain-specific biomedical tasks by combining them with specialized classifiers, delivering both efficiency and high accuracy for scientific and clinical image analysis.
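A hedged sketch of the two-step detect-then-classify idea is shown below, using the public ultralytics YOLO API for detection and a torchvision DenseNet-121 for per-crop classification. The weight files, image path, class-index order, and preprocessing are hypothetical placeholders, not the authors' trained models.

```python
# Sketch of a detect-then-classify pipeline: YOLO proposes cell boxes, DenseNet-121
# classifies each crop into one of five morphological categories. Paths are hypothetical.
import torch
from PIL import Image
from torchvision import models, transforms
from ultralytics import YOLO

detector = YOLO("rbc_detector.pt")                      # hypothetical fine-tuned YOLO weights
classifier = models.densenet121(num_classes=5).eval()   # DO / E / ES / G / R head
classifier.load_state_dict(torch.load("rbc_densenet121.pt", map_location="cpu"))

prep = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

image = Image.open("full_scope_smear.png").convert("RGB")
labels = ["DO", "E", "ES", "G", "R"]                    # assumed class-index order
for box in detector(image)[0].boxes.xyxy.tolist():       # [x1, y1, x2, y2] per detected cell
    crop = prep(image.crop(tuple(int(v) for v in box))).unsqueeze(0)
    with torch.no_grad():
        pred = classifier(crop).softmax(dim=1).argmax(dim=1).item()
    print(box, labels[pred])
```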
Qiu, P.; An, Z.; Ha, S.; Kumar, S.; Yu, X.; Sotiras, A.
Multimodal medical image analysis exploits complementary information from multiple data sources (e.g., multi-contrast Magnetic Resonance Imaging (MRI), Diffusion Tensor Imaging (DTI), and Positron Emission Tomography (PET)) to enhance diagnostic accuracy and support clinical decision-making. Central to this process is the learning of robust representations that capture both modality-invariant and modality-specific features, which can then be leveraged for downstream tasks such as MRI segmentation and normative modeling of population-level variation and individual deviations. However, learning robust and generalizable representations becomes particularly challenging in the presence of missing modalities and heterogeneous data distributions. Most existing methods address this challenge primarily from a statistical perspective, yet they lack a theoretical understanding of the underlying geometric behavior, such as how probability mass is allocated across modalities. In this paper, we introduce a generalized geometric perspective for multimodal representation learning grounded in the concept of barycenters, which unifies a broad class of existing methods under a common theoretical perspective. Building on this barycentric formulation, we propose a novel approach that leverages generalized Wasserstein barycenters with hierarchical modality-specific priors to better preserve the geometry of unimodal distributions and enhance representation quality. We evaluated our framework on two key multimodal tasks, brain tumor MRI segmentation and normative modeling, demonstrating consistent improvements over a variety of multimodal approaches. Our results highlight the potential of scalable, theoretically grounded approaches to advance robust and generalizable representation learning in medical imaging applications.
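The barycentric idea underpinning this work has a convenient closed form in the simplest setting, sketched below: the Wasserstein-2 barycenter of one-dimensional Gaussians is again Gaussian, with mean and standard deviation given by the weighted averages of the component means and standard deviations. The modalities and numbers are made up; the paper's generalized, hierarchical construction is not reproduced.

```python
# Textbook 1D Gaussian Wasserstein-2 barycenter: mean = sum(w_i * mu_i), std = sum(w_i * sigma_i).
import numpy as np

mu = np.array([0.0, 2.0, 3.0])       # per-modality means (illustrative stand-ins)
sigma = np.array([1.0, 0.5, 2.0])    # per-modality standard deviations
w = np.array([0.5, 0.3, 0.2])        # barycentric weights, summing to 1

bary_mu = np.dot(w, mu)
bary_sigma = np.dot(w, sigma)
print(f"barycenter: N({bary_mu:.2f}, {bary_sigma:.2f}^2)")
```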
Mauri, C.; Mckenzie, A.; Analoro, C.; Yeon, E.; Coviello, R.; Mora, J.; Chollet, E.; Deden Binder, L.; Mahar, A.; Lin, S.; Benlahcen, M.; Ream, A.; Jama, A.; Garcia, I.; Tran, N.; Onta, P.; Wood, S.; Willis, A.; Mahmood, A.; Sinoballa, G.; Malki, A.; Tran, K.; Malireddy, V.; Onumajuru, N.; Lakshmanan, S.; Hercules Landaverde, K.; Sidow, R.; Wood, D.; Nguyen, B.; Hernandez, J.; Bernier, M.; Hunter, J.; Malki, A.; Tum, A.; Chavez, V.; Shahu, Z.; Vasi, I.; Visser, A.; Ghaouta, Z.; Bond, F.; Vigneshwaran, R.; Kirkpatrick, E.; Avalos Barbosa, M.; Rauh, K.; Herisse, R.; Garcia Pallares, E.; Zeng, X.
The cerebral vasculature is central to brain function, with alterations linked to numerous cerebrovascular and neurological disorders. Yet, no single imaging modality can capture the entire cerebral vascular network in humans. Instead, an array of techniques is sensitized to different spatial scales, each trading off resolution for coverage. Magnetic Resonance Imaging (MRI) typically resolves only large pial vessels, while high-resolution microscopy allows micrometer-scale vessels to be mapped over limited spatial extents. These techniques must therefore be combined to obtain a complete mapping of the cerebral angioarchitecture, which underscores the need for automatic, cross-modal vessel segmentation. Here, we introduce VesSynth, a flexible vessel segmentation framework that achieves state-of-the-art accuracy across multiple modalities and spatial resolutions (MR, optical and X-ray imaging), despite being trained entirely on synthetic data. By enabling consistent vascular mapping across scales, this framework paves the way to comprehensive investigation of cerebrovascular organization and its role in health and disease.
Djebbara, I.; Yin, Z.; Friismose, A. I.; Poulsen, F. R.; Hojo, E.; Aunan-Diop, J. S.
Mechanical properties of biological tissues vary across spatial scales, yet radiomics typically relies on fixed, heuristic choices for neighbourhood size, kernel geometry, and spectral content - choices that can silently reshape the feature space before any modelling begins. We introduce a label-free, information-theoretic framework for selecting extraction parameters in multi-frequency MRE radiomics. For each configuration θ - neighbourhood radius r, kernel geometry k (sphere or shell), and frequency subset f - we extract a radiomics feature matrix and score it using an objective J(θ) that integrates distributional richness (Shannon entropy), cross-frequency coherence (canonical correlation), inter-feature redundancy (Spearman correlation), and bootstrap stability. We evaluate 121 configurations per tissue in multi-frequency MRE (30-60 Hz) of human brain, liver, and a calibrated phantom, and test robustness using 10,000 Dirichlet-sampled objective weightings. Across tissues, neighbourhood aggregation is consistently preferred over voxel-wise extraction, outperforming the no-neighbourhood baseline in 98.4-100% of weightings. External validation in 100 independent brain scans acquired with a different protocol and wider frequency range (20-90 Hz) confirms a reproducible mesoscopic plateau at r = 3-5 (9-15 mm), with a modal optimum at r = 4; omitting neighbourhood analysis reduces J(θ) by 38% relative to each subject's optimum. Frequency-subset preferences replicate across datasets, with lower frequencies most frequently selected for brain. By turning ad hoc extraction choices into an outcome-free optimisation step, this framework improves reproducibility, reduces sensitivity to heuristic parameter choices, and generalises across acquisition protocols and imaging sites.
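To make the structure of such an objective concrete, the sketch below combines three of the named ingredients (per-feature Shannon entropy, mean absolute Spearman redundancy, and bootstrap stability) into a single weighted score. The weights, the entropy estimator, and the omission of the cross-frequency canonical-correlation term are simplifying assumptions, not the paper's exact J(θ).

```python
# Illustrative composite score in the spirit of J(theta): reward richness and
# stability, penalize redundancy. Simplified relative to the paper's objective.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

def feature_entropy(x, bins=16):
    p, _ = np.histogram(x, bins=bins)
    p = p[p > 0] / p.sum()
    return float(-(p * np.log(p)).sum() / np.log(bins))    # normalized to [0, 1]

def score(features, w=(0.4, 0.3, 0.3), n_boot=50):
    """features: (n_samples, n_features) radiomics matrix for one configuration."""
    richness = np.mean([feature_entropy(f) for f in features.T])
    rho = np.abs(spearmanr(features).correlation)           # pairwise |Spearman| matrix
    redundancy = (rho.sum() - np.trace(rho)) / (rho.size - rho.shape[0])
    boots = np.stack([features[rng.integers(0, len(features), len(features))].mean(axis=0)
                      for _ in range(n_boot)])
    stability = 1.0 / (1.0 + boots.std(axis=0).mean())      # higher = more reproducible
    return w[0] * richness + w[1] * (1.0 - redundancy) + w[2] * stability

print(round(score(rng.normal(size=(40, 10))), 3))
```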
Bizjak, Z.; Zagar, J.; Spiclin, Z.
Automated and reliable image quality assessment (IQA) is essential for safe use of medical image synthesis in critical applications like adaptive radiotherapy, treatment planning, or missing-modality reconstruction, where unnoticed generative artifacts may adversely affect outcomes. We evaluated image-to-image translation quality by coupling large-scale expert visual quality assessment with explainable automated IQA modeling. An adversarial diffusion-based framework, SynDiff, was applied to four cross-modality synthesis tasks, including three inter-MR and a CBCT-to-CT translation. Using four-fold cross-validation, ten reference-based and eight no-reference IQA metrics were computed for all synthesized images. Visual IQA ratings were independently collected from thirteen expert raters using a predetermined protocol and a specialized image viewer enabling blinded, randomized six-point Likert scoring. Auto-Sklearn was employed to learn ensemble regression models mapping IQA metrics to visual consensus ratings, with separate models trained on reference-based and no-reference metrics. The models closely reproduced the distribution and ordering of expert ratings, typically within +/- 0.5 Likert points. Reference-based models achieved higher agreement with visual ratings than no-reference models (R^2 = 0.75 vs. 0.59, respectively), although the latter remained unbiased and informative. Explainability analyses highlighted structure- and contrast-sensitive metrics as key predictors. Overall, the results demonstrate that ensemble regression models can provide transparent, scalable, and clinically meaningful quality control for generative medical imaging.
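A minimal stand-in for the regression step is sketched below, mapping per-image IQA metrics to mean Likert ratings with scikit-learn; a RandomForestRegressor replaces the Auto-Sklearn ensemble search, and the data are synthetic placeholders.

```python
# Hedged stand-in: regress expert Likert consensus on per-image IQA metrics.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_images, n_metrics = 200, 10                  # e.g., PSNR, SSIM, LPIPS, ... per image
X = rng.normal(size=(n_images, n_metrics))     # standardized IQA metric values (synthetic)
y = np.clip(3.5 + X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.4, size=n_images), 1, 6)

model = RandomForestRegressor(n_estimators=300, random_state=0)
r2 = cross_val_score(model, X, y, cv=4, scoring="r2")   # four-fold CV as in the study
print("cross-validated R^2:", r2.round(2))
```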
Lu, H.-E.; Koivisto, D.; Lou, Y.; Zeng, Z.; Yu, T.; Wang, J.; Meng, X.; Nowikow, C.; Wilson, R.; Kumbhare, D.; Pu, J.
Deep learning has transformed medical image and video analysis, but it usually requires large, well-annotated datasets. In many clinical domains, especially when testing novel mechanistic hypotheses, such retrospective datasets are hard to obtain since acquiring adequate cohorts is time-intensive, costly, and operationally difficult. This creates a critical translational gap: scientifically compelling early-stage ideas may remain untested due to a lack of sufficient sample size to support conventional deep learning pipelines. Developing data-efficient strategies for evaluating new hypotheses within small prospective cohorts is therefore essential to de-risk innovation before large-scale validation. Myofascial Pain Syndrome (MPS) exemplifies this challenge, as quantitative ultrasound imaging biomarkers for MPS remain underexplored. We investigated whether MPS in the upper trapezius can be detected from full B-mode ultrasound videos in a small prospective cohort (11 controls, 13 patients). Videos were automatically preprocessed and resampled using a sliding window strategy to expand training samples (404 clips). A self-supervised Video Diffusion Encoder (VDE) was developed to learn spatiotemporal representations without relying on extensive labeled data, and compared with transfer-learning-based ResNet, VideoMAE, and SimCLR baselines. Using subject-level stratified four-fold cross-validation, the VDE outperformed transfer learning baselines and achieved performance comparable to SimCLR, with subject-level AUC of 0.79 and accuracy of 0.86, and no significant differences between latent-only and combined trigger point analyses. These results demonstrate that self-supervised diffusion learning can support robust, data-efficient deep learning in small prospective studies, enabling early feasibility testing of innovative ultrasound biomarkers before large-scale clinical trials.
Taherkhani, M.; Pizzolato, M.; Morup, M.; Dyrby, T. B.
Diffusion-weighted magnetic resonance imaging (dMRI) is used to study white matter microstructure and to delineate pathways by estimating fiber orientation distributions (FODs). Symmetric FODs represent the conventional model assuming antipodal symmetry in water diffusion. However, in complex regions with bending, branching or fanning fibers, this assumption is not guaranteed. To better capture such underlying fiber geometries, asymmetric FODs (A-FODs), derived from neighboring FODs, have been introduced. Here, we propose an Encoder-based Curvature-Aware Regularization (EnCAR) method for estimating A-FODs. Incorporating curvature features into the regularization weight applied to neighboring voxels improves reconstruction of A-FODs. A self-supervised Transformer network, combined with a Spherical Harmonics Semantic Encoder, learns region-specific regularization parameters from this local neighborhood to capture the diversity of fiber geometries across the brain. The EnCAR method was verified on the DiSCo challenge phantom, and applied to in vivo multi-shell human data. The model estimated sharp, high-angular-resolution A-FODs that were well aligned with local fiber pathways. Compared with established FOD and A-FOD methods, it performed on par in regions dominated by symmetric FODs and outperformed them in complex asymmetric regions. Quantitative evaluation using the Asymmetry Index (ASI) and Model Discrepancy Index (MDI) confirmed improved consistency with the underlying diffusion signals. By ensuring smooth directional transitions, this work enhances the visibility of continuous fiber segments.
Chandra, S.
Background: Current deep learning models in computational pathology, radiology, and digital pathology produce opaque predictions that lack the explainable artificial intelligence (xAI) capabilities required for clinical adoption. Despite achieving radiologist-level performance in tasks from whole-slide image (WSI) classification to mammographic screening, these models function as black boxes: clinicians cannot trace predictions to specific biological features, verify outputs against established morphological criteria, or integrate AI reasoning into precision oncology workflows and tumor board decision-making. Methods: We present Virtual Spectral Decomposition (VSD), a modality-agnostic, interpretable-by-design framework that decomposes medical images into six biologically interpretable tissue composition channels using sigmoid threshold functions - the same mathematical structure as CT windowing. Unlike post-hoc xAI methods (Grad-CAM, SHAP, LIME) applied to black-box deep learning models, VSD channels have pre-defined biological meanings derived from tissue physics, providing inherent explainability without sacrificing quantitative rigor. For whole-slide image (WSI) analysis in digital pathology, we introduce the dendritic tile selection algorithm, a biologically-inspired hierarchical architecture achieving 70-80% computational reduction while preferentially sampling the tumor immune microenvironment. VSD is validated across three cancer types and imaging modalities: pancreatic ductal adenocarcinoma (PDAC) on CT imaging, lung adenocarcinoma (LUAD) on H&E-stained pathology slides using TCGA data, and breast cancer on screening mammography. Composition entropy of the six-channel vector is computed as a visual Biological Entropy Index (vBEI) - an imaging biomarker quantifying the diversity of active biological defense systems. Results: In pancreatic cancer, the fat-to-stroma ratio (a novel CT-derived radiomics biomarker) declines from >5.0 (normal) to <0.5 (advanced PDAC), enabling early detection of desmoplastic invasion before mass formation on standard imaging. In lung cancer, composition entropy from H&E whole-slide images correlates with tumor immune microenvironment markers from RNA-seq (CD3: rho=+0.57, p=0.009; CD8: rho=+0.54, p=0.015; PD-1: rho=+0.54, p=0.013) and predicts overall survival (low entropy immune-desert phenotype: 71% mortality vs 29%, p=0.032; n=20 TCGA-LUAD), providing immune phenotyping for checkpoint immunotherapy patient selection from a $5 H&E slide without molecular assays. In breast cancer, each lesion type produces a characteristic six-channel fingerprint functioning as an interpretable computer-aided diagnosis (CAD) system for quantitative BI-RADS assessment and subtype classification (IDC vs ILC vs DCIS vs IBC). A five-level xAI audit trail provides complete traceability from clinical decision support output to specific biological structures visible on the original images. Conclusion: VSD establishes a unified, interpretable-by-design mathematical framework for explainable tissue composition analysis across imaging modalities and cancer types. Unlike black-box deep learning and post-hoc xAI approaches, VSD provides inherently interpretable, clinically verifiable cancer detection and immune phenotyping from standard clinical imaging at existing costs - without requiring foundation model infrastructure, specialized hardware, or molecular assays. 
The open-source pipeline (Google Colab, Supplementary Material) enables immediate reproducibility and extension to additional cancer types across the pan-cancer TCGA atlas.
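A composition-entropy index of the kind described here (the vBEI) can be written in a few lines; the sketch below computes the normalized Shannon entropy of a six-channel composition vector. The example channel values are invented, not taken from the paper.

```python
# Minimal sketch of a composition-entropy index: normalized Shannon entropy of a
# six-channel tissue-composition vector, scaled to [0, 1]. Example values are made up.
import numpy as np

def composition_entropy(channels):
    p = np.asarray(channels, dtype=float)
    p = p / p.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(len(channels)))

# hypothetical mean channel responses: fat, fluid, parenchyma, stroma, vascular, calcification
print(composition_entropy([0.30, 0.05, 0.40, 0.15, 0.08, 0.02]))
```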
Boiardi, F. E.; Lain, A. D.; Posma, J. M.
Pneumonia detection in chest X-rays (CXRs) is complicated by high inter-observer variability and overlapping radiographic patterns. While deep learning (DL) solutions show promise, limitations in generalisability and explainability hinder clinical adoption. We address these challenges by introducing a holistic DL-based computer-aided diagnosis (CAD) pipeline for pneumonia detection, localisation, and structured report generation from CXRs. We curated the largest composite of publicly available CXRs to date (N=922,634), of which [Formula] were used for training. MIMIC-CXR radiology reports were relabelled using a local large language model (LLM), positing that LLM-derived pneumonia labels would yield higher diagnostic sensitivity than the provided rule-based natural language processing (rNLP) labels. DenseNet-121 classifiers were trained on four configurations: MIMIC-CXR (rNLP), MIMIC-CXR (LLM), and each supplemented with VinDr-CXR data. Gradient-weighted Class Activation Mapping (Grad-CAM) provided visual explainability and lung zone-based localisation. LLM-driven relabelling significantly improved human-label agreement (96.5% vs 72.5%, P=1.66x10^-11). The best-performing model (MIMIC-CXR (LLM) + VinDr-CXR) achieved 82.08% sensitivity and 81.97% precision, surpassing both radiologist sensitivity ranges (64-77.7%) and CheXNet's pneumonia F1-score (43.5%). Grad-CAM localisation attained a moderate F1-score of 52.9% (sensitivity=65.7%, precision=44.3%), confirming focus alignment with pathological lung regions while highlighting areas for refinement. These findings demonstrate that LLM-driven label curation, combined with DL, can exceed conventional rNLP and radiologist performance, advancing high-quality data integration in predictive medical imaging. Clinically, our pipeline offers rapid triage, automated report drafting, and real-time pneumonia surveillance: tools that can streamline radiology workflows and mitigate diagnostic errors.
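The localisation step follows the standard Grad-CAM recipe, sketched below for a torchvision DenseNet-121: gradients of the class score are average-pooled into channel weights and used to combine the last convolutional feature maps. The random input, untrained weights, two-class head, and choice of class index are illustrative assumptions, not the authors' trained pipeline.

```python
# Standard Grad-CAM on DenseNet-121, replicating torchvision's forward pass manually.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.densenet121(weights=None, num_classes=2).eval()
x = torch.randn(1, 3, 224, 224)                       # stand-in for a preprocessed CXR

feats = model.features(x)                             # (1, 1024, 7, 7) conv feature maps
feats.retain_grad()                                   # keep gradients w.r.t. the maps
out = F.adaptive_avg_pool2d(F.relu(feats), (1, 1)).flatten(1)
score = model.classifier(out)[0, 1]                   # "pneumonia" logit (assumed index 1)
score.backward()

weights = feats.grad.mean(dim=(2, 3), keepdim=True)   # global-average-pooled gradients
cam = F.relu((weights * feats).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # normalized heat map in [0, 1]
print(cam.shape)                                      # torch.Size([1, 1, 224, 224])
```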
Avaria-Saldias, R. H.; Ortiz, D.; Palma-Espinosa, J.; Cancino, A.; Cox, P.; Salas, R.; Chabert, S.
Accurate characterisation of the haemodynamic response function (HRF) is central to interpreting blood-oxygen-level-dependent (BOLD) signals in functional magnetic resonance imaging, yet standard estimation approaches remain centred around phenomenological formulations lacking biophysical grounding. We present a physics-informed neural network (PINN) framework that bridges these paradigms by embedding the Balloon-Windkessel model directly into the training objective of a multi-headed neural network. Our approach simultaneously estimates probable latent neurovascular state variables such as cerebral blood inflow, metabolic rate of oxygen consumption, blood volume, and deoxyhaemoglobin content, through an indirect optimisation scheme in which the predicted BOLD signal is obtained via convolution of the estimated HRF with experimental stimuli. Training is governed by a composite loss, balancing differential-equation residuals, physiological initial conditions and data fidelity. In simulations with temporal signal-to-noise ratios representative of clinical acquisitions, the framework recovered ground-truth state variables with coefficients of determination exceeding 0.99 and mean squared errors below 10^-3, at a physics-to-data weighting of 0.40:0.60. Application to 1.5 T block-design fMRI data from an ischaemic stroke patient yielded physiologically plausible, subject-specific HRF estimates, establishing feasibility of single-subject, physics-constrained HRF inference without reliance on fixed gamma basis assumptions. To our knowledge, this constitutes the first deployment of a single PINN incorporating the full Balloon-Windkessel model within an indirect training objective, reconstructing full BOLD observations, positioning PINN-based haemodynamic modelling as a principled and personalised route towards more interpretable and patient-specific fMRI biomarkers.
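The composite-loss structure described here is easiest to see on a toy problem; in the sketch below the full Balloon-Windkessel system is deliberately replaced by a single stand-in ODE, dx/dt = -x, while keeping the physics-residual, initial-condition, and data-fidelity terms and the 0.4:0.6 physics-to-data weighting quoted in the abstract. Network size, collocation points, and data are invented.

```python
# Toy PINN composite loss: physics residual + initial condition + data fidelity.
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

t_phys = torch.linspace(0, 4, 128).reshape(-1, 1).requires_grad_(True)   # collocation points
t_data = torch.tensor([[0.5], [1.0], [2.0]])
x_data = torch.exp(-t_data)                                              # "measured" signal stand-in

for step in range(2000):
    opt.zero_grad()
    x = net(t_phys)
    dxdt = torch.autograd.grad(x, t_phys, torch.ones_like(x), create_graph=True)[0]
    loss_phys = torch.mean((dxdt + x) ** 2)                  # ODE residual: dx/dt + x = 0
    loss_ic = (net(torch.zeros(1, 1)) - 1.0).pow(2).mean()   # initial condition x(0) = 1
    loss_data = torch.mean((net(t_data) - x_data) ** 2)      # data-fidelity term
    loss = 0.4 * (loss_phys + loss_ic) + 0.6 * loss_data     # physics-to-data weighting
    loss.backward()
    opt.step()
```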
Noetscher, G.; Miles, A.; Danskin, B.; Tang, D.; Ingersoll, M.; Nunez Ponasso, G. C.; Paxton, C.; Ludwig, R.; Burnham, E.; Deng, Z.-D.; Lu, H.; Weise, K.; Knösche, T.; Rosen, B.; Bikson, M.; Makaroff, S. N.
Electrical conductivity of cortical gray matter governs the magnitude and spatial distribution of electric fields generated by brain stimulation and intrinsic neuronal activity measured with M/EEG and intracortical recordings. However, reported macroscopic conductivity values vary by more than threefold, limiting the fidelity of bioelectromagnetic models and leaving unresolved whether this variability reflects measurement uncertainty or genuine structural heterogeneity of cortical tissue. Here, we present a multiscale computational framework that, for the first time, attempts to derive mesoscale conductivity maps of mouse visual cortex at 50-µm resolution directly from large-volume, segmented nanometer-scale electron microscopy data. The Minnie 65 subvolume of the MICrONS dataset is accurately subdivided into 1,224 50-µm cubic blocks. Each block contains, on average, 40-50 million membrane facets of a highly convoluted and dense cellular structure. Three orthogonal electrode pairs are applied to each isolated block to estimate the three principal components of the conductivity tensor. Quasistatic electric modeling is enabled by an iterative boundary-element fast multipole method (BEM-FMM) under the approximation of non-conducting membranes (DC conductivity). Spatially averaged conductivity values predicted by our framework agree well with prior low-resolution measurements in rats, validating the approach. At the same time, the resulting mesoscale maps reveal pronounced conductivity granularity at 50-100 µm scales as well as significant variations in both radial and tangential directions. These results indicate that mesoscale conductivity heterogeneity could be an intrinsic structural property of the cortex. Limitations and extensions of this study are discussed in detail.
Gao, Z.; Han, K.; Ling, Z.; Zhang, H.; Botchwey, E.; Liu, W.; Hua, X.; Nie, S.; Jia, S.
Optical scattering in biological tissues fundamentally limits fluorescence imaging by disrupting spatial and angular information, thereby restricting volumetric visualization. Although hardware-intensive and computational approaches have advanced scattering microscopy, practical three-dimensional imaging through tissue remains constrained by instrumental complexity and axial ambiguity. Here, we present volumetric scattering microscopy (VSM), a scan-free, optical-computational framework for three-dimensional fluorescence imaging in scattering biological media. VSM captures angularly resolved speckle-encoded fluorescence using an aperture-segmented Fourier light-field configuration and reconstructs volumetric structure through adaptive feature-based descattering and joint sub-pupil alignment. This hybrid strategy preserves angular information embedded in scattered light without wavefront measurement or mechanical scanning, while maintaining the simplicity of a standard epi-fluorescence architecture. We demonstrate high-fidelity volumetric reconstruction across phantoms, engineered cellular systems, ex vivo tissues with volumetric muscle loss, and intact Xenopus embryos, achieving preserved spatial resolution, enhanced optical sectioning, and quantitative accuracy under strong scattering conditions. VSM supports large-field, robust volumetric imaging in both layered and fully embedded scattering environments. By transforming scattered light into a structured encoding resource, VSM establishes a scalable pathway toward routine three-dimensional fluorescence imaging in complex biological systems.
Valijonov, J.; Soar, P.; Le Houx, J.; Tozzi, G.
Digital volume correlation (DVC) has become the benchmark experimental technique for full-field strain measurement in bone mechanics. In our previous work we developed a novel data-driven image mechanics (D2IM) approach that learns from DVC data and predicts displacement fields directly from undeformed X-ray computed tomography (XCT) images, deriving strain fields from such predictions. However, strain fields derived through numerical differentiation of displacement fields amplify high-frequency noise, and regularization techniques compromise spatial resolution while incurring substantial computational costs. Here we propose the upgraded D2IM-Strain to predict strain fields directly from XCT images of bone. Two prediction strategies were compared: displacement-derived strain and direct strain prediction. The direct strain prediction model significantly improved accuracy, particularly for strain magnitudes below 10,000 µε, taken as a representative threshold value for bone tissue yielding in compression. In addition, the direct approach reduced false-positive high-strain classifications by 75%. By eliminating numerical differentiation, the approach reduces noise amplification while maintaining computational efficiency. These findings represent a critical step toward developing robust data-driven volume correlation methods for hierarchical materials.
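The noise-amplification problem motivating direct strain prediction is easy to reproduce numerically; the sketch below differentiates a noisy one-dimensional displacement field and compares the resulting strain scatter with the true 10,000 microstrain level. Field shape and noise magnitude are invented.

```python
# Noise amplification when deriving strain from a noisy displacement field.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 501)                          # position (mm)
dx = x[1] - x[0]
u_true = 0.01 * x                                        # uniform 10,000 microstrain field
u_noisy = u_true + rng.normal(scale=5e-4, size=x.size)   # DVC-like displacement noise (mm)

strain_true = np.gradient(u_true, dx)
strain_noisy = np.gradient(u_noisy, dx)
print("strain scatter after differentiation: %.0f microstrain" % (1e6 * strain_noisy.std()))
print("true strain level:                    %.0f microstrain" % (1e6 * strain_true.mean()))
```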
Chandra, S.
Background. Pancreatic ductal adenocarcinoma (PDAC) has a five-year survival rate of approximately 12%, largely because it is typically diagnosed at an advanced stage. CT-based computational methods for early detection exist but rely on black-box deep learning or large texture feature sets without tissue-specific interpretability. Methods. We developed Virtual Spectral Decomposition (VSD), which applies six parameterized sigmoid functions S(HU) = 1/(1+exp(-alpha x (HU - mu))) to standard portal-venous CT, decomposing each pixel into tissue-specific response channels for fat (mu=-60), fluid (mu=10), parenchyma (mu=45), stroma (mu=75), vascular (mu=130), and calcification (mu=250). Dendritic Binary Gating identifies structural content per channel using morphological filtering, enabling co-firing analysis and lone firer identification. A 25-feature signature was extracted per patient. Three independent datasets were analyzed: NIH Pancreas-CT (n=78 healthy), Medical Segmentation Decathlon Task07 (n=281 PDAC, paired tumor/adjacent tissue), and CPTAC-PDA from The Cancer Imaging Archive (n=82, multi-institutional, with DICOM time point tags). The same six sigmoid parameters were used across all datasets without retraining. Results. VSD achieved AUC 0.943 for field effect detection (healthy vs cancer-adjacent parenchyma) and AUC 0.931 for patient-stratified tumor specification on MSD. On CPTAC-PDA, VSD achieved AUC 0.961 (6 features) and 0.979 (25 features) for distinguishing healthy from cancer-bearing pancreas on scans obtained prior to pathological diagnosis. All significant features replicated across datasets in the same direction: z_fat (d=-2.10, p=3.5e-27), z_fluid (d=-2.76, p=2.4e-38), fire_fat (d=+2.18, p=1.2e-28). Critically, VSD severity did not correlate with days-from-diagnosis (r=-0.008, p=0.944) across a range of day -1394 to day +249. Patient C3N-01375, scanned 3.8 years before pathological diagnosis, had VSD severity 1.87, well above the healthy mean of 0.94 +/- 0.33. The tissue transformation signature was temporally stable, indicating an early, persistent tissue state rather than a progressively worsening process. Conclusions. VSD with Dendritic Binary Gating detects a stable pancreatic tissue composition signature on standard CT that is present years before clinical diagnosis, validated across three independent datasets without parameter adjustment. The six sigmoid channels map to biologically meaningful tissue components through a fully transparent interpretability chain. The temporal stability of the signal implies a detection window of 3-7 years, consistent with known PanIN-3 microenvironment transformation timelines. VSD functions as a single-scan screening tool applicable to any abdominal CT performed during the pre-clinical window.
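The channel definition quoted above translates directly into code; the sketch below evaluates the six sigmoid responses S(HU) = 1/(1+exp(-alpha x (HU - mu))) with the published mu values. The shared slope alpha is an illustrative placeholder, since the abstract does not state its value.

```python
# Six VSD tissue-response channels; mu values are from the abstract, alpha is assumed.
import numpy as np

MU = {"fat": -60, "fluid": 10, "parenchyma": 45, "stroma": 75, "vascular": 130, "calcification": 250}

def vsd_channels(hu, alpha=0.1):
    """Decompose an HU image (any shape) into six tissue-response channels."""
    hu = np.asarray(hu, dtype=float)
    return {name: 1.0 / (1.0 + np.exp(-alpha * (hu - mu))) for name, mu in MU.items()}

# toy 2x2 portal-venous CT patch in Hounsfield units
patch = np.array([[-80.0, 40.0], [70.0, 300.0]])
for name, resp in vsd_channels(patch).items():
    print(name, resp.round(2))
```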
Nguyen, H.; Li, C.; Peng, C.; Simpson, P.; Ye, N.; Nguyen, Q.
Foundation models for computational pathology have rapidly emerged as powerful tools for extracting rich biological and morphological representations from histopathology images. However, variations in model architecture, pre-training data, and optimization objectives often lead to task-dependent performance, rather than universal generalization. As a result, effective strategies for integrating their complementary strengths are essential to fully realize the potential of foundation models for robust histopathology analysis. Meanwhile, recent breakthroughs such as spatial transcriptomics provide an unprecedented opportunity to integrate genetic and histopathology information from the same patient sample, thereby maximizing both molecular and anatomical pathology insights. Here, we propose a framework that adaptively integrates multiple pathology foundation models to predict spatially resolved gene expression from histopathology images. Specifically, each model's embedding is first mapped to gene-level predictions via a dedicated prediction head, enabling model-specific feature utilization. A lightweight weighting network then adaptively aggregates these predictions to produce a unified and robust output at gene and spatial location levels. Across multiple spatial transcriptomics datasets, our approach consistently outperforms both individual foundation models and classical ensembling methods. Focusing on breast cancer, we observe substantial gains in prediction accuracy for clinically relevant PAM50 subtype markers and drug-target genes. Moreover, the proposed framework improves interpretability by revealing model-specific contributions and specialization at the gene level. Overall, our work presents an effective solution to integrating multiple foundation models for enhancing the genetic analyses of histopathology images.
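A minimal sketch of the described aggregation is given below: one linear prediction head per foundation model maps its embedding to gene-level outputs, and a small gating network produces softmax weights used to fuse them. Embedding dimensions, the number of models, and the gene count are assumptions, not the paper's settings.

```python
# Hedged sketch: per-model prediction heads plus a lightweight weighting (gating) network.
import torch
import torch.nn as nn

class FusedExpressionPredictor(nn.Module):
    def __init__(self, embed_dims=(768, 1024, 1536), n_genes=50):   # e.g., PAM50-sized panel
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(d, n_genes) for d in embed_dims)
        self.gate = nn.Sequential(nn.Linear(sum(embed_dims), 64), nn.ReLU(),
                                  nn.Linear(64, len(embed_dims)))

    def forward(self, embeddings):                    # list of (batch, d_i) tensors
        preds = torch.stack([h(e) for h, e in zip(self.heads, embeddings)], dim=1)
        w = torch.softmax(self.gate(torch.cat(embeddings, dim=1)), dim=1)   # (batch, n_models)
        return (w.unsqueeze(-1) * preds).sum(dim=1)   # weighted gene-level prediction

model = FusedExpressionPredictor()
embs = [torch.randn(4, d) for d in (768, 1024, 1536)]
print(model(embs).shape)                              # torch.Size([4, 50])
```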
Shang, W.; Hong, G.; Keller, W. E.; Morton, R. A.; Zeboulon, P.; Kenichi, T.; Duan, X.; Gould, D. B.; Kim, T. N.
The neurosensory retina is one of the most metabolically active tissues in the body and a uniquely accessible extension of the central nervous system, where neuronal and vascular structures can be visualized non-invasively. Its accessibility and highly organized laminar architecture make it a powerful model for studying vascular development and a window into systemic health. Although computational analyses of retinal images have enabled risk assessment for ocular and systemic diseases, most vascular studies rely on two-dimensional frameworks with limited resolution of capillary structure and layer-specific organization. Here, we present a high-resolution three-dimensional (3D) imaging and analysis pipeline enabling quantification of retinal microvasculature and extraction of structural and network metrics across vascular layers. We apply this approach to two mouse models of aberrant retinal vascular development: one with spontaneous postnatal chorioretinal neovascularization and another with disrupted neurovascular lattice formation and layered organization in early life. Across both pathologic contexts, 3D analysis provides detailed characterization of vascular architecture and identifies early vulnerability of the intermediate layer plexus (IMP) as a sensitive indicator of abnormal remodeling and neovascularization. This framework enables precise characterization of retinal vasculature and establishes a foundation for identifying new retinal biomarkers with potential relevance to neurovascular and systemic disease.
Gargano, J. A.; Rice, A.; Chari, D. A.; Parrell, B.; Lammert, A. C.
Reverse correlation is a widely used and well-established method for probing latent perceptual representations in which subjects render subjective preference responses to ambiguous stimuli. Stimuli are purposefully designed to have no direct relationship with the target representation (e.g., they are randomly generated), a property which makes each individual stimulus minimally informative toward reconstructing the target, and often difficult for subjects to interpret. As a result, a large number of stimulus-response pairs must be gathered from a given subject in order for reconstructions to be of sufficient quality, making the task fatiguing. Recent work has demonstrated that the number of trials needed can be substantially reduced using a compressive sensing framework that incorporates into the reconstruction process the assumption that the target representation can be sparsely represented in some basis. Here, we introduce an alternative method that incorporates the sparsity assumption directly into stimulus generation, which holds promise not only for improving efficiency, but also for improving the interpretability of stimuli from the subjects' perspective. We develop this new method as a mathematical variation of the compressive sensing approach, before conducting one simulation study and two human-subjects experiments to assess the benefits of this method for reconstruction quality, sample size efficiency, and subjective interpretability. Results show that sparse stimulus generation improves all three of these areas relative to conventional reverse correlation approaches, and also relative to compressive sensing in most conditions.
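For contrast with the sparse-generation and compressive-sensing variants discussed here, the sketch below runs the textbook reverse-correlation reconstruction on a simulated observer: the estimate is the mean of stimuli that elicited positive responses minus the mean of those that did not. The target template, trial count, and noise level are invented.

```python
# Textbook reverse-correlation reconstruction with a simulated binary observer.
import numpy as np

rng = np.random.default_rng(0)
n_trials, dim = 2000, 64
target = np.zeros(dim); target[20:28] = 1.0             # sparse latent template (made up)

stimuli = rng.normal(size=(n_trials, dim))              # random, individually uninformative stimuli
responses = (stimuli @ target + rng.normal(scale=2.0, size=n_trials)) > 0

estimate = stimuli[responses].mean(0) - stimuli[~responses].mean(0)
estimate /= np.abs(estimate).max()
print(f"correlation with target: {np.corrcoef(estimate, target)[0, 1]:.2f}")
```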
Pyne, S.; Wainwright, B.; Ali, M. H.; Lee, H.; Ray, M. S.; Senthil, S.; Jammalamadaka, S. R.
Progressive optic neuropathies, particularly glaucoma, represent a significant global health challenge, and the need for precise understanding of the heterogeneous neurodegenerative phenotypes cannot be overstated. Here, we brought together two complementary sources of unstructured yet clinically relevant information about neuroretinal rim (NRR) thinning, a common clinical marker of such decay. These are based on a new dataset of digital fundus images and a corresponding one of optical coherence tomography, both collected from a large clinical cohort of healthy eyes. First, we represented them using a common data structure that imposed a high-resolution scale of 180 equally spaced and registered measurements on a 360° circular axis. We modeled such NRR data points of each eye as circular curves, and aligned these multimodal curves to obtain a fused NRR curve for each eye. Unsupervised clustering of these fused curves identified 4 clusters of eyes with structural heterogeneity, which were also found to have distinctive clinical covariates. The computation of functional derivatives revealed the troughs in the curves of each cluster. Using circular statistics, we estimated the directional distributions of such troughs as potentially clinically relevant regions of NRR decay. We also demonstrated that multimodal fusion leads to improvement in the robustness of baseline NRR data obtained from fundus imaging.
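The trough-analysis step can be illustrated compactly; the sketch below finds circular local minima on a synthetic 180-point fused NRR curve and summarizes their angular locations with a circular mean direction. The curve itself is invented and stands in for a fused NRR profile.

```python
# Trough detection on a circular curve plus a circular mean of trough directions.
import numpy as np

theta = np.deg2rad(np.arange(0, 360, 2))                 # 180 registered angular positions
curve = 1.0 + 0.2 * np.cos(theta - np.deg2rad(80)) + 0.1 * np.cos(2 * theta)   # synthetic fused curve

# circular local minima: strictly smaller than both neighbours, with wrap-around
left, right = np.roll(curve, 1), np.roll(curve, -1)
troughs = theta[(curve < left) & (curve < right)]

# circular mean direction of the detected troughs
mean_dir = np.degrees(np.arctan2(np.sin(troughs).mean(), np.cos(troughs).mean())) % 360
print(f"{troughs.size} trough(s), circular mean direction ~ {mean_dir:.0f} deg")
```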